ABSTRACT

In 1989, when functional programming was still considered a niche
topic, Hughes wrote a visionary paper arguing convincingly “why
functional programming matters”. More than two decades have passed.
Has functional programming really mattered? Our answer is a
resounding “Yes!”. Functional programming is now at the forefront of
a new generation of programming technologies, and is enjoying
increasing popularity and influence. In this paper, we review the impact
of functional programming, focusing on how it has changed the way
we may construct programs, the way we may verify programs, and
fundamentally the way we may think about programs.

Contact: hu@nii.ac.jp 


1 INTRODUCTION 

Twenty-five years ago, Hughes published a paper entitled “Why
Functional Programming Matters”, which has since become
one of the most cited papers in the field. Rather than discussing what
functional programming isn't (it has no assignment, no side effects,
no explicit prescription of the flow of control), the paper emphasizes
what functional programming is. In particular, it shows that
two distinctive functional features, namely higher order functions
and lazy evaluation, are capable of bringing considerable improvement
in modularity, resulting in crucial advantages in software
development.

Twenty-five years on, how has functional programming mattered?
Hughes’s vision has become more widely accepted. Mainstream
languages such as C#, C++ and Java scrambled one after another to
offer dedicated support for lambda expressions, enabling programming
with higher-order functions. Lazy evaluation has also risen
to prominence, with numerous papers on new ways to exploit its
strengths and to address its shortcomings.

One way to gauge the popularity of functional programming is
through its presence at conferences both in academia and industry.
The ACM International Conference on Functional Programming
grew to 500 participants in 2014. Developer conferences on
functional programming abound, such as the Erlang User
Conference/Factory in Stockholm, London and San Francisco, Scala Days
and Clojure West in San Francisco, Lambda Jam in Chicago, and Lambda
Days in Krakow, all with hundreds of participants. Functional
programming is also well represented nowadays at more general
industry conferences such as GOTO in Aarhus, Strange Loop in St.
Louis, and YOW! in Melbourne, Brisbane and Sydney, each with
well over 1,000 delegates.

Functional languages are also increasingly being adopted by industry
in the real world.¹ To name a few examples: Facebook uses
Haskell to make news feeds run smoothly; WhatsApp relies on
Erlang to run messaging servers, achieving up to 2 million connected
users per server; Twitter, LinkedIn, Foursquare, Tumblr, and
Klout use Scala to build their core infrastructure for sites. And while
not using functional languages directly, Google’s popular MapReduce
model for cloud computation was inspired by the map and
reduce functions commonly found in functional programming.

Generally speaking, functional programming is a style of programming:
the main program is a function that is defined in terms
of other functions, and the primary method of computation is the 
application of functions to arguments. Unlike traditional imperative 
programming, where computation is a sequence of transitions from 
states to states, functional programming has no implicit state and 
places its emphasis entirely on expressions (or terms). Functional 
programming focuses on what is being computed rather than how 
it is being computed, much like the way we define mathematical 
functions. As a simple example, consider a mathematical definition 
of the factorial function: 

    0! = 1
    (n + 1)! = (n + 1) n!

Its definition in the functional language Haskell (60) has exactly the 
same structure: 

    fac 0 = 1
    fac (n + 1) = (n + 1) * fac n

In contrast, with imperative programming, we would consider a
state of (n, s) representing the current counter and the partial result,
and show how to compute the final result by a sequence of state
transitions from the initial state of (x, 1).

    n = x;
    s = 1;
    while (n > 0) do {
        s = s * n;
        n = n - 1;
    }

¹ https://wiki.haskell.org/Haskell_in_industry

The contrast in style is apparent: functional programs are more
declarative and often much shorter than their imperative counterparts.
This is certainly important. Shorter and clearer code leads to
improved development productivity and higher quality (fewer bugs).
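As a rough illustration (our own, not taken from Hughes’s paper), the two
styles can be placed side by side in Haskell itself. The sketch below assumes
non-negative arguments, writes fac with n - 1 on the right-hand side because
n+k patterns are not part of Haskell 2010, and models the loop as explicit
transitions on the state (n, s); the names step and facLoop are hypothetical.

```haskell
-- Declarative definition, mirroring the mathematical one.
-- Assumption: the argument is non-negative (a negative argument would not terminate).
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)

-- One iteration of the imperative loop, as a transition on the state (n, s).
step :: (Integer, Integer) -> (Integer, Integer)
step (n, s) = (n - 1, s * n)

-- Run the transitions from the initial state (x, 1) until the counter reaches 0,
-- then read off the partial result s.
facLoop :: Integer -> Integer
facLoop x = snd (until ((<= 0) . fst) step (x, 1))

main :: IO ()
main = print (fac 5, facLoop 5)
```

Both definitions compute the same function; the second merely makes explicit
the state that the imperative program threads through its loop.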

But it is probably a more subtle difference, well hidden behind the
overloaded use of the symbol ‘=’, that really sets the two apart. In
the imperative program, ‘=’ refers to a destructive update, assigning
a new value to the left-hand-side variable, whereas in the functional
program ‘=’ means true equality: fac 0 is equal to 1 in any execution
context, and the two can be used interchangeably. This characteristic
of functional programming (known as referential transparency or
purity) has a profound influence on the way programs are constructed
and reasoned about, a topic that will be covered extensively in
Section 2.

What this toy example does not illustrate is the use of higher
order functions, a powerful abstraction mechanism highlighted in
Hughes’s paper. We will not repeat the message here. Instead,
we will describe the concept of monads (in Section 3), a design
pattern for structuring computation that crucially depends on higher
order functions. We will show how monads are used in developing
domain-specific languages and taming side effects. In Section 4 we
revisit the ideas of purity and higher order functions in the context of
parallel and distributed computing, again showing the unparalleled
advantages that they bring. Lastly, in Section 5 we briefly sketch
the impact of functional programming in terms of influence on education
and other programming languages, and real-world adoption.
As a note to readers, this review paper is not intended to systematically
teach functional programming, nor to comprehensively discuss
its features and techniques. These can be found in various textbooks
on functional programming. Rather, we aim to give
a broad and yet focused view on how functional programming has
mattered to software development, through showcasing advances in
both academia and industry.

Since functional programming is a style, in theory one could write
functional programs in any language, but of course with vastly differing
levels of effort. We call a language functional if its design
encourages or to some extent enforces a functional style. Prominent
languages in this category include Haskell (60), Erlang,
ML, OCaml (83), F#, Lisp, Scheme,
Racket (39), Scala, and Clojure. In this paper, we mostly
use Haskell, because Haskell is not only a dominant functional language,
but also covers all the most important features of functional
programming.

2 CORRECTNESS OF PROGRAM CONSTRUCTION

Today’s software systems are essential parts of our everyday lives, 
and their correctness is becoming ever more important; incorrect 
programs may not only cause inconvenience, but also endanger 
life and limb. A correct program is one that does exactly what its 
designers and users intend it to do. 


Obvious as it sounds, guaranteeing the correctness of programs,
or even defining the meaning of correctness, is notoriously difficult.
The complexity of today’s software systems is often to blame.
But the design of many programming languages in use today (the
fundamental tools we use to build software) does not help either.
Programs are expressed as sequences of commands returning a final
result, but also at the same time updating the overall state of the
system, causing both intended and unintended side effects. State
update is just one of the many kinds of side effect: programs may
throw exceptions, send emails, or even launch missiles as side
effects. For “convenience”, most languages allow such effects to
be performed, without warning, anywhere in a program.

The result is that in order to specify the complete correctness of
any program, one has to describe the whole state of the system, and
the unlimited possibilities of interacting with the outside world: an
impossible task indeed. Just consider the task of testing part of a
software system, perhaps a function called f. Before f can be executed,
the tester must bring the system into the intended pre-state.
After f has finished, the tester must check the outcome, which includes
checking that the system state is as expected. But in general,
the system state is only partly observable, and even identifying the
parts of the state which f changed is problematic. Much of the work
of testing imperative software consists of setting up the right state
beforehand, and observing the final state afterwards.

Functional programming departs dramatically from this state of
impediment by promoting purity: the result value of an execution
depends on nothing other than the argument values, and no state may
change as program execution proceeds. Consequently, it becomes
possible to specify program behaviours independently of the rest of
the system. For example, given a function that reverses a list (where
[] represents the empty list and ++ appends two lists), we can state
the following set of properties governing the function’s correctness.²

    reverse [] ≡ []
    reverse [x] ≡ [x]
    reverse (ys ++ xs) ≡ reverse xs ++ reverse ys
    reverse (reverse xs) ≡ xs

As there are neither side effects nor outside influence, these laws
(the first three) completely characterize the function reverse. Drawing
an analogy with sworn testimony, the laws specify “the
behaviour, the whole behaviour, and nothing but the behaviour”!
The significance of this ability to claim that two expressions are
equal (in any semantically observable way) is that one can now freely
replace variables by their values, and in general any expressions
by their equals; that is, programs are referentially transparent. This
freedom makes functional programs more tractable mathematically
than their conventional counterparts, allowing the use of equational
reasoning in the design, construction and verification of programs.
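Because the laws mention nothing but arguments and results, they can be
turned into executable checks directly. The following sketch assumes the
built-in list type and uses == as an executable stand-in for semantic
equality; it tests the four laws exhaustively on all lists over {0, 1, 2} up
to a small length. The names listsUpTo and law1 to law4 are hypothetical,
introduced only for this illustration, and passing such checks is evidence
rather than proof.

```haskell
-- All lists over {0, 1, 2} of length at most n: a small, exhaustive test universe.
listsUpTo :: Int -> [[Int]]
listsUpTo n = concatMap (\k -> sequence (replicate k [0, 1, 2])) [0 .. n]

law1, law2, law3, law4 :: Bool
-- reverse [] is []
law1 = reverse ([] :: [Int]) == []
-- reverse [x] is [x]
law2 = and [ reverse [x] == [x] | x <- [0, 1, 2 :: Int] ]
-- reverse distributes over ++, swapping the operands
law3 = and [ reverse (ys ++ xs) == reverse xs ++ reverse ys
           | xs <- listsUpTo 3, ys <- listsUpTo 3 ]
-- reverse is its own inverse
law4 = and [ reverse (reverse xs) == xs | xs <- listsUpTo 4 ]

main :: IO ()
main = print [law1, law2, law3, law4]
```

No set-up of a pre-state and no inspection of a post-state is needed: each
check is a closed expression whose value depends only on its arguments.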

This is just a toy example, but the underlying idea is far-reaching.
Readers who are Linux users may have come across xmonad,³ a
tiling window manager for X11, known for its stability. xmonad
is implemented in Haskell and relies on heavy use of semi-formal
methods and program derivation for reliability; window manager
properties (such as the behavior of window focus) are specified as
equational laws, similar to the ones above, and exploited for testing
using QuickCheck (26).

² As a notational convention, we use “≡” to denote semantic equality
to avoid confusion with the use of “=” (function definition) and “==”
(comparison for structural equality) in Haskell.

³ http://xmonad.org/

The holy grail of program correctness is to prove the absence of
bugs. A landmark achievement in this respect is Leroy’s CompCert
(74), an optimizing C compiler which is almost entirely proven correct
with the Coq proof assistant. Tellingly, when John Regehr tested
many C compilers using his random C program generator Csmith,
he found 79 bugs in gcc, 202 in LLVM, but only 6 in CompCert
(722). The middle-end bugs found in all other compilers were entirely
absent from CompCert, and as of early 2011, CompCert is the
only compiler tested with Csmith which lacks wrong-code errors.
As in the case of xmonad, the use of purely functional immutable
data structures played a crucial role in this unprecedented achievement.
This idea of immutability is also appearing in application
areas where people might least expect it. Datomic, a fully transactional,
cloud-ready and distributed database, presents the entire
database as an immutable value, leveraging immutability to achieve
strong consistency combined with horizontal read scalability.

In the rest of this section, we will see equational reasoning at
work in formal proofs of program correctness, in program testing,
and in program optimization. We will also see how to associate
algebraic properties with functional forms for program reasoning, how
to automatically verify type properties, and how to structure and
develop algebraic properties and laws for program derivation.

2.1 Equational Reasoning 

We have already seen the use of equational properties as specifications
in the reverse example. Thanks to referential transparency, we
are not only able to write the specifications, but also to reason with
them.

2.1.1 Correctness Proofs One way to make use of these equations
is in correctness proofs, just as in mathematics. Functional
programs are often recursively defined over datatypes, lending
themselves well to proofs by structural induction. For example, the
reverse function we have already specified via equations can be
defined in Haskell as follows:

    reverse [] = []
    reverse (x : xs) = reverse xs ++ [x]

The second equation says that, to reverse a list with the first element
as x and the rest of the list as xs, we reverse xs and append x to the
end of it. In fact, this equation is derivable from the third law for
reverse by replacing ys by [x] and simplifying.

Now suppose we wish to prove that the definition actually satisfies
its specification, say the property that reverse is its own inverse: for
any finite list xs,

    reverse (reverse xs) ≡ xs

holds. In order to prove this property for any finite list xs, it is
sufficient to show that (1) it holds when xs is the empty list [], and (2)
if it holds for xs, then it also holds for x : xs; this is the induction
step. (1) is easy to show, and (2) can be confirmed as follows.

      reverse (reverse (x : xs))
    ≡ { def. of reverse }
      reverse (reverse xs ++ [x])
    ≡ { law: reverse (xs ++ ys) ≡ reverse ys ++ reverse xs }
      reverse [x] ++ reverse (reverse xs)
    ≡ { def. of reverse and inductive hypothesis }
      (reverse [] ++ [x]) ++ xs
    ≡ { def. of reverse and simplification }
      x : xs

Here, the law used in the above calculation can also be formally 
proven by induction. 
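The calculation can be sanity-checked by evaluating each of its lines for
concrete values. The sketch below is only such a check, not a proof: it
assumes the recursive definition from the text, renamed rev to avoid a clash
with the Prelude, and the helper names proofSteps and allEqual are
hypothetical.

```haskell
-- The recursive definition from the text, renamed to avoid clashing
-- with Prelude.reverse.
rev :: [a] -> [a]
rev []       = []
rev (x : xs) = rev xs ++ [x]

-- The successive lines of the induction step, instantiated at a given x and xs.
proofSteps :: a -> [a] -> [[a]]
proofSteps x xs =
  [ rev (rev (x : xs))
  , rev (rev xs ++ [x])
  , rev [x] ++ rev (rev xs)
  , (rev [] ++ [x]) ++ xs
  , x : xs
  ]

-- True when every element equals its successor, i.e. all lines denote one value.
allEqual :: Eq a => [a] -> Bool
allEqual ys = and (zipWith (==) ys (drop 1 ys))

main :: IO ()
main = print (allEqual (proofSteps 'a' "bcd"))
```

Every line of the calculation evaluates to the same list, as equational
reasoning requires; the inductive proof is what extends this from one
instance to all finite lists.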

Inductive proof is very commonly used in the functional setting as
an effective way to verify critical software systems. The inductive
proof of the inverse property of reverse has a strong resemblance
to that of the inverse property of complex encoding and decoding
in real practice (48). More practical application examples include
using the Coq proof assistant for proving the security properties of
the JavaCard platform (23), certifying an optimizing compiler for C,
and formally proving computational cryptography.

2.1.2 Property-based Testing Equational properties can also be
used for testing functional programs. Using the QuickCheck test
library (26), properties can be expressed via Haskell function
definitions. For example, the property of reverse stated in the previous
section can be written as

    propReverseReverse :: [Integer] -> Bool
    propReverseReverse xs =
      reverse (reverse xs) == xs

If the property holds, then the corresponding function should always
return True, so QuickCheck generates a large number of random
argument values and checks that the function returns True for each
one (the type stated for the property is needed to tell QuickCheck
what kind of test data to generate, namely lists of integers).
QuickCheck defines a domain specific language, embedded in Haskell,
for expressing properties in a testable subset of predicate calculus.
Quantified variables, such as xs above, range over “sets” which
are represented by test data generators, with fine control over the
distribution of the randomly generated data.

When a test fails, QuickCheck “shrinks” the test case to a minimal
failing example. If we test the following (wrong) property,

    propReverse :: [Integer] -> Bool
    propReverse xs = reverse xs == xs

then QuickCheck reports that [0,1] (or, occasionally, [1,0]) is a
counterexample; the shrunk counterexample is obtained by searching
for ways to simplify whatever randomly generated counterexample
is first found. We obtain [0,1] because at least two elements are
needed to make this property fail, and they cannot be equal, so
[0,0] is not a counterexample. Shrinking is of critical importance
to make property-based testing useful; without it, the “signal” that
causes a test to fail is drowned in the “noise” of randomly generated
data, and debugging failures is far more difficult.
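The idea of shrinking can be sketched without any library. The code below is
an illustration under our own assumptions, not QuickCheck’s API or its actual
shrinking strategy: shrinkInteger, shrinkList and minimise are hypothetical
helpers that greedily simplify a failing list (by dropping an element or
moving an element towards 0) for as long as the property still fails,
stopping at a locally minimal counterexample.

```haskell
-- The (wrong) property from the text.
propReverse :: [Integer] -> Bool
propReverse xs = reverse xs == xs

-- Smaller candidates for one integer: 0 first, then one step towards 0.
shrinkInteger :: Integer -> [Integer]
shrinkInteger n = [0 | n /= 0] ++ [n - signum n | abs n > 1]

-- Smaller candidates for a list: drop one element, or shrink one element.
shrinkList :: [Integer] -> [[Integer]]
shrinkList xs =
  [ take i xs ++ drop (i + 1) xs | i <- [0 .. length xs - 1] ] ++
  [ take i xs ++ [y] ++ drop (i + 1) xs
  | i <- [0 .. length xs - 1], y <- shrinkInteger (xs !! i) ]

-- Greedily move to the first smaller candidate that still fails the property;
-- stop when no smaller candidate fails.
minimise :: ([Integer] -> Bool) -> [Integer] -> [Integer]
minimise prop xs =
  case filter (not . prop) (shrinkList xs) of
    []      -> xs
    (y : _) -> minimise prop y

main :: IO ()
main = print (minimise propReverse [3, 7, 2])
```

Starting from the failing input [3, 7, 2], this search ends at [0, 1];
starting from [1, 0] it stops immediately, since every smaller candidate
satisfies the property.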

Interestingly, this kind of testing finesses Dijkstra’s famous objection
that testing can never demonstrate the absence of bugs in
software, only their presence. If we test properties that completely
specify a function (such as the four properties of reverse stated at
the start of this section) and if every possible argument is generated
with a non-zero probability, then property-based testing will eventually
find every possible bug. In practice this isn’t true, since we usually
do not have a complete specification, and we limit the size of
generated tests, but in principle we can find any possible bug this
way. And in practice, this approach to testing can be very effective,
since generated tests can explore scenarios that no human tester
would think to try.

QuickCheck is heavily used in the Haskell community: it is the
most heavily used testing package, and the 10th most used package
of any kind, in the Hackage Haskell package database. The core of
xmonad, discussed above, is thoroughly tested by simple equational
properties on the state of the window manager, just like those
we have discussed. The basic idea has been ported to many other
programming languages: FsCheck for F#, ScalaCheck for Scala, and
test.check for Clojure, and even to non-functional languages like Go
and Java. There is even a commercial version in Erlang, marketed
by Quviq AB, which adds libraries for defining state machine
models of effectful software. This version has been used to find
bugs in an Ericsson Media Proxy, to track down notorious race
conditions in the database software supplied with Erlang (64), and
to formalize the Basic Software part of the AutoSAR automotive
standard, for use in acceptance testing of vendors’ code for Volvo
Cars. In this last project, 3,000 pages of standards documents
were formalized as 20,000 lines of QuickCheck models, and used
to find more than 200 different defects, many of them defects in
the standard itself. A comparison with conventional test suites showed
that the QuickCheck code was almost an order of magnitude
smaller, despite testing more!

2.1.3 Automatic Optimization The ability to replace expressions
with their equals and be oblivious to the execution order is a huge
advantage in program optimization. The Glasgow Haskell Compiler
(GHC) uses equational reasoning extensively internally, and also
supports programmer-specified rewrite rules (equational
transformations) as part of the source program (in pragmas) for automatic
optimization. For instance, we could give GHC the inverse property
of reverse to eliminate unnecessary double reversals of lists, and
GHC will then apply the rule whenever it can. (While it is unlikely
that a programmer would write a double reversal explicitly, it could
well arise during optimization as a result of inlining other function
calls.)

    {-# RULES
    "reverse-inv" forall xs.
        reverse (reverse xs) = xs
      #-}

In practice, people make use of this utility for serious optimizations.
For example, shortcut fusion is used to remove unnecessary
intermediate data structures, and tupling transformation is used
to reduce multiple traversals of data.

HERMIT (37) is a powerful toolkit for developing new optimizations
by enabling systematic equational reasoning inside the
Glasgow Haskell Compiler’s optimization pipeline. It provides a
transformation API that can be used to build higher-level rewrite
tools.


2.2 Functional Forms 

Higher-order functions are not only useful for expressing programs, 
they can be helpful in reasoning and proofs as well. By associating 
general algebraic (equational) laws with higher order functions, we 
can automatically infer properties from these laws when the higher 
order functions are applied to produce specific programs. 

Two of the most important higher order functions are fold and
unfold (also known as catamorphism and anamorphism). They
capture two natural patterns of computation over recursive datatypes
such as lists and trees: unfolds generate data structures and folds
consume them. Here, we give them the name “functional forms”:
they can be used as design patterns to solve many computational
problems, and these solutions inherit their nice algebraic properties.
In this review, we focus on fold and unfold on lists. In fact, a single
generic definition of fold can be given for all (algebraic) datatypes
(80; 104; 38), and dually for unfold.

2.2.1 Algebraic Datatypes We have seen an example of an algebraic
datatype, namely List, earlier on in the reverse example. List
is the most commonly used datatype in functional programming,
to such an extent that the first functional language was named
Lisp, as in “LISt Processing”. A list whose elements have
the type a can be constructed by starting with the empty list Nil,
and successively adding elements of type a to the list, one by one,
using the data constructor Cons.

    data List a = Nil | Cons a (List a)

For instance, the list [1, 2, 3, 4], of type List Int, is represented as
follows:

    as = Cons 1 (Cons 2 (Cons 3 (Cons 4 Nil)))

The notations [] and infix : that we used above are simply shorthand
for applications of Nil and Cons: Cons x xs can be written as
x : xs. We will also use the “section notation” (surrounding an
operator by parentheses) to turn a binary infix operator ⊕ into a
prefix function: (⊕) a b = (a ⊕) b = (⊕ b) a = a ⊕ b.

2.2.2 Fold Foldr, which consumes a list and produces a value as
its result, is defined as follows:

    foldr f e Nil = e
    foldr f e (Cons x xs) = f x (foldr f e xs)

The effect of foldr f e xs is to take a list xs, and return the result of
replacing Nil by e and each Cons by f. For example, foldr f e as
converts the above list as to the value of

    f 1 (f 2 (f 3 (f 4 e))).

This structurally inductive computation pattern captured by foldr is
reusable; by choosing different fs and es, foldr can perform a variety
of interesting functions on lists. To take a few examples, sum
sums up all elements of a list, prod multiplies all elements of a list,
maxlist returns the maximum element of a list, reverse reverses a
list, map f applies function f to every element of a list, and inits
computes all initial prefix lists of a list. In the definitions below we
use partial applications of foldr: defining sum as foldr (+) 0 (with
two arguments rather than three) is the same as defining sum xs to
be foldr (+) 0 xs.

    sum     = foldr (+) 0
    prod    = foldr (×) 1
    maxlist = foldr max (−∞)
    reverse = foldr f [] where f a r = r ++ [a]
    map g   = foldr f [] where f a r = g a : r
    inits   = foldr f [[]] where f a r = [] : map (a :) r
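The definitions above can be made runnable with a few adjustments, which are
our assumptions rather than part of the text: the sketch uses Haskell’s
built-in lists and Prelude foldr instead of List with Nil and Cons, hides the
Prelude functions it redefines, writes × as (*), and represents −∞ with a
hypothetical wrapper type WithNegInf.

```haskell
import Prelude hiding (sum, reverse, map)

-- A value of type a extended with minus infinity. The derived Ord instance
-- orders NegInf below every Fin x because NegInf is declared first.
data WithNegInf a = NegInf | Fin a deriving (Eq, Ord, Show)

sum, prod :: [Integer] -> Integer
sum  = foldr (+) 0
prod = foldr (*) 1

-- maxlist returns NegInf on the empty list, Fin of the largest element otherwise.
maxlist :: Ord a => [a] -> WithNegInf a
maxlist = foldr (\a r -> max (Fin a) r) NegInf

reverse :: [a] -> [a]
reverse = foldr (\a r -> r ++ [a]) []

map :: (a -> b) -> [a] -> [b]
map g = foldr (\a r -> g a : r) []

-- Every initial segment of a list, shortest first.
inits :: [a] -> [[a]]
inits = foldr (\a r -> [] : map (a :) r) [[]]

main :: IO ()
main = do
  print (sum [1, 2, 3, 4], prod [1, 2, 3, 4])
  print (maxlist [3, 1, 4 :: Integer])
  print (reverse "abc", inits "abc")
```

Each function differs from the others only in the pair (f, e) handed to
foldr, which is what makes the laws of foldr applicable to all of them.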


2.2.3 Unfold Unfoldr, the dual of foldr, generates a list
from a “seed”. It is defined as follows:

    unfoldr p f g seed =
      if p seed then []
      else f seed : unfoldr p f g (g seed)

The functional form unfoldr takes a predicate p indicating when
the seed should unfold to the empty list. When the condition fails to
hold, function f is used to produce a new list element as the
head, and g is used to produce a new seed from which the tail of
the new list will unfold. Like foldr, unfoldr can be used to define
various functions on lists. For example, we can define the function
downfrom n to generate a list of numbers from n down to 1:

    downfrom n = unfoldr isZero f g n
      where
        isZero n = n == 0
        f n = n
        g n = n - 1
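A runnable version of this definition is sketched below. It assumes the
four-argument unfoldr of the text (the library function Data.List.unfoldr has
a different, Maybe-based signature and is deliberately not imported), and it
adds a second, hypothetical example, digitsRev, to show the same form reused
with a different predicate and step functions.

```haskell
-- The four-argument unfoldr from the text: stop when p holds on the seed,
-- otherwise emit f seed and continue from g seed.
unfoldr :: (s -> Bool) -> (s -> a) -> (s -> s) -> s -> [a]
unfoldr p f g seed
  | p seed    = []
  | otherwise = f seed : unfoldr p f g (g seed)

-- Numbers from n down to 1. Assumption: n is non-negative; for a negative
-- n the seed never reaches 0 and the list is infinite.
downfrom :: Integer -> [Integer]
downfrom = unfoldr (== 0) id (subtract 1)

-- Decimal digits of a non-negative number, least significant first
-- (a hypothetical second example, not from the text).
digitsRev :: Integer -> [Integer]
digitsRev = unfoldr (== 0) (`mod` 10) (`div` 10)

main :: IO ()
main = print (downfrom 5, digitsRev 123)
```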

2.2.4 Composition of Functional Forms Foldr and unfoldr can be
composed to produce clear definitions of computations, which we
think of as specifications in this section, because of their declarative
nature. For example, the following gives a clear specification for
computing the maximum of all summations of all initial segments
of a list:

    mis = maxlist ∘ (map sum) ∘ inits

It is defined as a composition of four functions we defined as foldrs
before.

Given the duality of unfold and fold (one generating data structures
and the other consuming them), compositions of an unfold
followed by a fold form very interesting patterns, known as
hylomorphisms (80; 50; 49). A simple example of a hylomorphism is
the factorial function:

    factorial n = prod (downfrom n)

where prod is defined using foldr and downfrom using unfoldr.
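Both compositions can be tried out with library functions. The sketch below
is one possible rendering under stated assumptions: it uses inits and the
Maybe-based unfoldr from Data.List (which differs from the four-argument form
in the text), and the Prelude’s maximum in place of maxlist, which is safe
here because inits never returns an empty list.

```haskell
import Data.List (inits, unfoldr)

-- Maximum initial-segment sum. The empty segment contributes 0,
-- so the result is never negative.
mis :: [Integer] -> Integer
mis = maximum . map sum . inits

-- The unfold half of the hylomorphism: n, n-1, ..., 1.
-- Assumption: n is non-negative.
downfrom :: Integer -> [Integer]
downfrom = unfoldr (\n -> if n == 0 then Nothing else Just (n, n - 1))

-- The fold half: multiply all elements.
prod :: [Integer] -> Integer
prod = foldr (*) 1

-- Factorial as an unfold followed by a fold.
factorial :: Integer -> Integer
factorial = prod . downfrom

main :: IO ()
main = print (mis [1, -2, 3], factorial 5)
```

For [1, -2, 3] the initial segments sum to 0, 1, -1 and 2, so mis returns 2.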

2.2.5 General Laws Functional forms enjoy many nice laws and
properties that can be used to prove properties of programs that
are in those forms. Fold has, among others, the following three
important properties.

First, foldr has the following uniqueness property:

    foldr f1 e1 ≡ foldr f2 e2  ⟺  f1 ≡ f2 ∧ e1 ≡ e2

which means that two foldrs are equivalent (extensionally), if and
only if their corresponding components are equivalent. It serves as
the basis for constructing other equational rules.

Second, foldr is equipped with a general fusion rule to deal with
composition of foldrs, saying that composition of a function and a
foldr can be fused into a foldr, under the right conditions.

    h (f a r) ≡ f' a (h r)
    ------------------------------
    h ∘ foldr f e ≡ foldr f' (h e)

Third, multiple traversals of the same list by different foldrs can
be tupled into a single foldr, and thus a single traversal.

    h x ≡ (foldr f1 e1 x, foldr f2 e2 x)
    ------------------------------------
    h ≡ foldr f (e1, e2)
      where f a (r1, r2) = (f1 a r1, f2 a r2)
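As a small instance of the tupling law, consider computing the sum and the
length of a list together. The example is ours, not from the text: it takes
f1 = (+), e1 = 0, f2 = \_ n -> n + 1 and e2 = 0, and the function names are
hypothetical.

```haskell
-- Two separate traversals of the same list.
sumAndLength :: [Integer] -> (Integer, Integer)
sumAndLength x = (foldr (+) 0 x, foldr (\_ n -> n + 1) 0 x)

-- One traversal, obtained from the tupling law with
-- f1 = (+), e1 = 0, f2 = \_ n -> n + 1, e2 = 0.
sumAndLength' :: [Integer] -> (Integer, Integer)
sumAndLength' = foldr f (0, 0)
  where f a (r1, r2) = (a + r1, r2 + 1)

main :: IO ()
main = print (sumAndLength [3, 1, 4], sumAndLength' [3, 1, 4])
```

The combined f is built mechanically from f1 and f2, so the tupled program
needs no fresh insight, only an application of the law.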

Let us use a simple example to demonstrate how fusion is useful
in the derivation of efficient programs. Assume that we have an
implementation of insertion sort (into descending order) using foldr:

    sort = foldr insert []
      where
        insert a [] = [a]
        insert a (b : x) = if a > b then a : b : x else b : insert a x

Now suppose that we want to compute the maximum element of a
list. This is easy to do using the existing sorting program:

    maxList = hd ∘ sort

where hd is a function that selects the first element of a nonempty
list (hd (a : x) = a) and returns −∞ if the list is empty. Though
declarative and obviously correct, this program is inefficient. It is
overkill to sort the whole list, just to get the head. Fusion, using the
laws above, provides a standard way to solve this problem. The laws
tell us that if we can calculate f' such that

    ∀a, x. hd (insert a x) ≡ f' a (hd x)

then we can transform hd ∘ sort to foldr f' (−∞). By instantiating
x as b : y and performing a simple calculation, we obtain f' as
follows.

    f' a b = hd (insert a (b : y))
           = if a > b then a else b

Thus, we have derived a definition of maxList using a single foldr,
which is exactly the same as the definition in Section 2.2.2 above.
This fusion improves the time complexity of maxList from quadratic
in the length of the list to linear.
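The derivation can be checked on concrete inputs by running the specification
and the fused program side by side. In the sketch below, −∞ is modelled by a
hypothetical wrapper type WithNegInf, and maxListSpec and maxListFused are
names we introduce; the code illustrates the result of the calculation and
does not replace it.

```haskell
-- A value extended with minus infinity; NegInf sorts below every Fin x.
data WithNegInf a = NegInf | Fin a deriving (Eq, Ord, Show)

-- Insertion into a list kept in descending order.
insert :: Ord a => a -> [a] -> [a]
insert a []      = [a]
insert a (b : x) = if a > b then a : b : x else b : insert a x

sort :: Ord a => [a] -> [a]
sort = foldr insert []

-- Head of a list, or minus infinity when the list is empty.
hd :: [a] -> WithNegInf a
hd []      = NegInf
hd (a : _) = Fin a

-- Specification: sort into descending order, then take the head (quadratic).
maxListSpec :: Ord a => [a] -> WithNegInf a
maxListSpec = hd . sort

-- Fused program: a single foldr with the derived f' and hd [] as seed (linear).
maxListFused :: Ord a => [a] -> WithNegInf a
maxListFused = foldr f' NegInf
  where f' a b = if Fin a > b then Fin a else b

main :: IO ()
main = print (maxListSpec [3, 1, 4, 1, 5 :: Integer], maxListFused [3, 1, 4, 1, 5 :: Integer])
```

The fusion condition hd (insert a x) ≡ f' a (hd x) can be read off the code:
for an empty x both sides are Fin a, and for x = b : y both sides are the
larger of a and b.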

As we have seen, equational reasoning and higher order functions 
(functional forms) enjoy a symbiotic relationship: each makes the 
other much more attractive. 

2.3 Types 

What is the type of a function like foldr? In the examples above, we
have already seen it used with several different types! Its arguments
are a function that combines a list element with an accumulator,
an initial value for the accumulator, and a list of elements to be
combined, but those list elements may be integers, lists themselves,
or indeed any type; the accumulator can likewise be of any type.
Being able to re-use higher-order functions like map and foldr for
different types of data is a part of what makes them so useful: it is
an essential feature of functional programming languages. Because
of this, early functional languages did without a static type-checker;
they were dynamically typed, and the compiler did not attempt to
discover type errors. Some, like Erlang or Clojure, still are.
2.3.1 Polymorphic Types In 1978, Milner introduced polymorphic
types to ML, to solve this problem. The key idea is to allow
types to include type variables, which can be instantiated to any
type at all, provided all the occurrences of the same variable are
instantiated to the same type. For example, the type of foldr is

    foldr :: (α -> β -> β) -> β -> List α -> β

where α is the type of the list elements, and β is the type of the
accumulator. When foldr was used to define sum above, it
was used with the type

    foldr :: (Integer -> Integer -> Integer) ->
             Integer -> List Integer -> Integer

(in which α and β are both instantiated to Integer); when it was
used to define sort, its type was

    foldr :: (γ -> List γ -> List γ) -> List γ -> List γ -> List γ

(in which α is replaced by γ, the type of the elements in the list to be
sorted, and β, the accumulator type, is a list of these). This allowed
the flexibility of polymorphism to be combined with the security of
static type-checking for the first time. ML also supported type
inference: the type of foldr (and indeed, every other function) could
be inferred by the compiler from its definition, freeing the programmer
from the need to write types in function definitions at all. So ML was
as concise and powerful as the untyped languages of its day, with the
added benefit of static type-checking.

Both these ideas have made a tremendous impact. Many functional
programming languages since, Haskell among them, have
borrowed the same approach to polymorphic typing that ML pioneered.
Java generics, introduced in Java 5, were based directly on
Odersky and Wadler’s adaptation of Milner’s ideas to Java; similar
features appeared thereafter in C#. Type inference (in a more
limited form) appeared in C# 3.0 in 2007, and type inference is
increasingly available in Java too.

In fact, ML’s polymorphic types also give us surprisingly useful
semantic information about the functions they describe. For example,
the reverse function can be applied to lists with any type of
element, and so has the polymorphic type

    reverse :: List α -> List α

This implies that it also satisfies

    ∀f, xs. map f (reverse xs) ≡ reverse (map f xs)

as, indeed, does any other function with the same polymorphic type!
These “free theorems”, discovered by Wadler, are an application
of Reynolds’s parametricity (99); they are used, among other
places, to justify the “short cut deforestation” optimisation in the
Glasgow Haskell Compiler.
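The free theorem can be observed on examples. The sketch below checks it for
three functions of the same polymorphic type; it is an illustration, not a
proof. It assumes GHC’s RankNTypes extension so that a polymorphic function
can be passed as an argument and used at two element types, and the name
commutesWithMap is hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Checks  map f (r xs) == r (map f xs)  for a list transformer r that is
-- polymorphic in the element type. r is used at Int on the left-hand side
-- and at String on the right-hand side.
commutesWithMap :: (forall a. [a] -> [a]) -> (Int -> String) -> [Int] -> Bool
commutesWithMap r f xs = map f (r xs) == r (map f xs)

main :: IO ()
main = do
  let xs = [1 .. 6]
  print [ commutesWithMap reverse show xs
        , commutesWithMap (take 2) show xs
        , commutesWithMap (drop 1) show xs
        ]
```

The check succeeds for reverse, take 2 and drop 1 alike, because a function
of this type can rearrange, drop or duplicate elements but cannot inspect
them.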

2.3.2 Type Classes for Overloading Haskell’s major innovation, 
as far as types are concerned, was its treatment of overloading] 


5 Over the years, many innovations have been made in Haskell’s type 
system; here we refer to innovations in the early versions of the language. 


For example, the equality operator is overloaded in Haskell, allowing
programmers to use different equality tests for different types of
data. This is achieved by declaring an equality type class:

class Eq a where
  (==) :: a -> a -> Bool

which declares that the equality operator (==) can be applied to any
type a with an instance of the class Eq; programmers can define
instances of each class, and the compiler infers automatically which
instance should be used.

Right from the start, Haskell allowed instances to depend on other 
instances. For example, (structural) equality on lists uses equality on 
the list elements, expressed in Haskell by defining 

instance Eq a => Eq (List a) where
  xs == ys = ...

That is, given an equality on type a, the compiler can construct an
equality on type List a, using the definition of == in the instance.
Or to put it another way, the compiler can reduce a goal to find an
Eq (List a) instance, to a goal to find an Eq a instance, for any
a. This looks a lot like logic programming! Over the years, Haskell
class and instance declarations have become more and more powerful,
incorporating aspects of both logic programming and functional
programming, with the result that the type-checker is now Turing
complete.⁶ The important observation here is that Haskell’s class
system gives us a programmable type checker, which, while it does
not allow us to accept programs that will generate run-time type
errors, does allow us to construct type systems for DSLs embedded
in Haskell with exquisite precision. Scala’s implicits were directly
inspired by Haskell’s type classes, and allow many of the same tricks
to be played (92).
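
To make the instance resolution concrete, here is a minimal, self-contained sketch. It hides the Prelude's own Eq class so that the class from the text can be declared afresh; the List type and the Bool instance are our assumptions, added only so that the example runs.

```haskell
import Prelude hiding (Eq (..))

-- A user-defined list type, standing in for the List of the text.
data List a = Nil | Cons a (List a)

class Eq a where
  (==) :: a -> a -> Bool

infix 4 ==

instance Eq Bool where
  True  == True  = True
  False == False = True
  _     == _     = False

-- Equality on lists is built from equality on the elements.
instance Eq a => Eq (List a) where
  Nil       == Nil       = True
  Cons x xs == Cons y ys = x == y && xs == ys
  _         == _         = False

main :: IO ()
main = do
  let xs = Cons True (Cons False Nil)
  print (xs == xs)
  -- Here the compiler reduces Eq (List (List Bool)) to Eq (List Bool),
  -- and that in turn to Eq Bool.
  print (Cons xs Nil == Cons Nil Nil)
```

The second comparison needs an instance at type List (List Bool), which is never written down; the compiler assembles it from the two instances above.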

2.3.3 Types and Logic Finally, there is a deep connection between
types and logic. Just consider the type of the apply function:

apply :: (a -> b, a) -> b

If we read function arrow types as logical implications, and the pair 
type as a conjunction, then this type reads as the (true) proposition 

((A ⇒ B) ∧ A) ⇒ B

It turns out that this is not a coincidence: we can soundly regard 
apply as a proof of this property, and any type which is inhabited 
by some expression (in a sufficiently carefully designed functional 
language) corresponds to a provable property. Proof assistants such 
as Coq and Agda (90) are based on this correspondence, the
Curry-Howard isomorphism, and enable users to prove theorems by 
writing programs. A key notion here is that of a dependent type, 
a type which depends on a value. For example, if the List type is 
parameterized not only on the element type, but also on the length 
of the list, then reverse can be given a more informative type 

reverse :: ∀k :: Nat, a :: Set. List k a -> List k a

representing the fact that its result has the same length as its
argument.⁷ Predicates can then be represented by types that are


6 Robert Dockins, The GHC typechecker is Turing-complete, 
https://mail.haskell.org/pipermail/haskell/2006-August/018355.html. 

7 Here Set is roughly speaking “the type of types”, or more precisely, the 
type of small types, i.e. excluding types such as Set itself. 
sometimes empty. For example, Even k might be a type that is empty
if k is odd, and non-empty if k is even — so constructing a term of 
type Even k proves that k is even. Using these ideas, it is possible 
to construct large functional programs with formal proofs of corre- 
ctness. Perhaps the most impressive example to date is CompCert 
( 3 ), already discussed above. 
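
These ideas can be approximated in Haskell itself. The sketch below assumes GHC with the DataKinds, GADTs and KindSignatures extensions; the names Vec, vmap, vtoList, EvenZ, EvenSS and fourIsEven are ours. It uses a length-preserving map rather than reverse, because a length-indexed reverse needs an extra arithmetic lemma. Haskell is not a total language, so undefined inhabits every type, and the reading of types as propositions is only informal here.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Propositions as types: this program proves ((A => B) /\ A) => B.
apply :: (a -> b, a) -> b
apply (f, x) = f x

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- Lists indexed by their length.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- The type records that mapping preserves the length.
vmap :: (a -> b) -> Vec n a -> Vec n b
vmap _ VNil         = VNil
vmap f (VCons x xs) = VCons (f x) (vmap f xs)

vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs

-- Even n has a (total) inhabitant exactly when n is even.
data Even (n :: Nat) where
  EvenZ  :: Even 'Z
  EvenSS :: Even n -> Even ('S ('S n))

-- Constructing this term proves that four is even.
fourIsEven :: Even ('S ('S ('S ('S 'Z))))
fourIsEven = EvenSS (EvenSS EvenZ)

main :: IO ()
main = do
  print (apply ((+ 1), 41 :: Int))
  print (vtoList (vmap (* 2) (VCons (1 :: Int) (VCons 2 VNil))))
```

No term built only from EvenZ and EvenSS has type Even ('S 'Z), which is the sense in which that type is empty.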

2.4 Algebra of Programming 

In Section m we have seen a useful set of functional forms that 
capture common computation patterns while enjoying general alge- 
braic laws for program reasoning. This attractive idea can be carried 
much further, leading to the development of many specific theo- 
rems as building blocks for more complex reasoning. The result is 
the algebra of programming G2ED, where functional program- 
ming provides an algebraic framework for building programming 
theories for solving various computational problems by program 
calculation (derivation) — a systematic way of transforming speci- 
fication into efficient implementation through equational reasoning. 
This supports, in practice, Dijkstra’s argument that programming
should be considered as a discipline of a mathematical nature (32). 

In this section, we review the programming theories that have 
been developed for constructing programs from specifications, and 
demonstrate how these theories can be used to derive programs in 
various forms for efficient sequential or parallel computation. 

2.4.1 Programming: Deriving Programs from Specifications 

Before addressing the solution of programming problems, con- 
sider the following mathematical problem known as the Chinese 
Chicken-Rabbit-Cage Problem: 

An unknown number of rabbits and chickens are locked in a 
cage. Counting from above, we see there are 35 heads, and 
counting from below, there are 94 feet. How many rabbits and 
chickens are in the cage? 

The degree of difficulty of this problem largely depends on which 
mathematical tools one has to hand. A preschool child may find 
it very difficult; he has no choice but to enumerate all possibili- 
ties. However, a middle-school student equipped with knowledge of 
equation solving should find it easy. He would let x be the number 
of chickens and y that of rabbits, and quickly set up the following 
problem specification: 

x + y = 35
2x + 4y = 94.

The rest is entirely straightforward. The theory of linear equations 
gives us a strategy for solving them systematically, to discover the 
values of the unknowns (i.e., x = 23, y = 12). 

We want to solve our programming problems this way too! We 
would like to have an algebra of programs: a concise notation 
for problem specification , and a set of symbol manipulation rules 
with which we may calculate (derive) programs from specifications 
by equational reasoning. Here, by “specification”, we mean two 
things: (1) a naive functional program that expresses a straightfo- 
rward solution whose correctness is obvious; and (2) a program in 
a specific form of composition of functional forms. By “program”, 
we mean an efficient functional program, which may be sequential, 
parallel, or distributed. 
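
The cage problem already shows the two roles in miniature. In the sketch below, cageSpec is a naive specification in the first sense (enumerate and test), and cageSolve is the program obtained by eliminating x from the two equations by hand. Both function names are hypothetical, introduced only for this illustration.

```haskell
-- Specification: try every split of the heads between the two animals.
-- x counts chickens (two feet), y counts rabbits (four feet).
cageSpec :: Int -> Int -> [(Int, Int)]
cageSpec heads feet =
  [ (x, y) | x <- [0 .. heads], let y = heads - x, 2 * x + 4 * y == feet ]

-- Derived program: from x + y = heads and 2x + 4y = feet we get
-- 2y = feet - 2 * heads, and then x = heads - y.
cageSolve :: Int -> Int -> Maybe (Int, Int)
cageSolve heads feet
  | even d && y >= 0 && x >= 0 = Just (x, y)
  | otherwise                  = Nothing
  where
    d = feet - 2 * heads
    y = d `div` 2
    x = heads - y

main :: IO ()
main = do
  print (cageSpec 35 94)   -- [(23,12)]
  print (cageSolve 35 94)  -- Just (23,12)
```

The specification takes time proportional to the number of heads, while the derived program takes constant time; the correctness of the latter rests on the algebra, not on testing.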


2.4.2 Programming Theories Just like the specific laws develo- 
ped for factorization in solving equations, many laws (theorems) for 
deriving efficient functional programs from specifications have been 
developed Gins. They are used to capture programming princi- 
ples by bridging the gap between specifications and their implemen- 
ting programs. As an example, many optimization problems can be 
naively specified in a generate-and-test way: generating all the pos- 
sibilities, keeping those that satisfy the requirements, and returning 
one that maximizes a certain value: 

opt = maxlist . map value . filter p . gen

To solve this kind of optimization problems using folds, many the- 
orems (HEO have been developed. One example is: if (1) gen, p, 
and value can be defined as foldrs, and (2) value = foldr (⊕) e,
where ⊕ is associative and max is distributive over ⊕, then opt can
be solved in linear time by a functional program in terms of a single
foldr. With this theorem, solving optimization problems becomes easy:
one just needs to specify the problem in the form described and the
rest will follow!
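
A hand-worked instance in the spirit of this theorem, not its general construction: take gen to be the generation of all subsequences, p to be the trivial predicate, and value to be sum, i.e. foldr (+) 0. The names optSpec and optFold are ours.

```haskell
import Data.List (subsequences)

-- Specification in generate-and-test form (with p = const True):
-- generate every subsequence, value each by its sum, take the maximum.
optSpec :: [Int] -> Int
optSpec = maximum . map sum . filter (const True) . subsequences

-- Derived program: a single foldr, linear in the length of the input.
-- The step relies on the distributivity law
--   max (x + a) (x + b) = x + max a b.
optFold :: [Int] -> Int
optFold = foldr (\x best -> max best (x + best)) 0

main :: IO ()
main = do
  let xs = [3, -1, 4, -5, 2]
  print (optSpec xs)  -- 9
  print (optFold xs)  -- 9
```

The specification examines exponentially many candidates; the fold keeps only the best value seen so far, which is enough because every subsequence of x : xs either omits x or adds x to a subsequence of xs.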

2.4.3 Programming Theory Development Many theories for the 
algebra of programming have been developed — but new ones can 
always be added. There is a general procedure to develop pro- 
gramming theories consisting of the following three major steps 
fU4ll54t. 

1. Define a specific form of programs, in terms of functional 
forms and their composition, that can be used to describe a 
class of interesting computations. 

2. Develop calculational rules (theorems) to bridge the gap betw- 
een the new specific form and the existing functional forms. 

3. Develop more theorems that can turn more general programs 
into the specific form to widen the application scope. 

The first step plays a very important role in this development. 
The specific form defined should not only be powerful enough to 
describe computations of interest, but also manipulable and suitable 
for the later development of calculational laws. 

2.4.4 Systems for Program Derivation Many systems have been
developed for supporting program derivation and calculation.
Examples are KIDS (106), MAG (29), and Yicho (54). In general, such
tools (1) support interactive development of programs by equational
reasoning so that users can focus on their creative steps, (2)
guarantee correctness of the derived program by automatically
verifying each calculation step, (3) support development of new
calculation rules so that mechanical derivation steps can be easily
grouped, and (4) make the development process easy to maintain (i.e.,
the development process should be well documented).

Proof assistants and theorem provers can provide a cheap way to 
implement a system for program reasoning and program calculation 
(8811 1 16tl89l ). For instance, with Coq CD, a popular theorem prover, 
one can use Ltac, a language for programming new tactics, to build 
a program calculation system (116): (1) Coq tactics can be used
effectively for automatic proving and automatic rewriting, so that 
tedious calculation can be hidden with tactics; (2) new tactics can 
coexist with the existing tactics, and a lot of useful theories of Coq 
are ready to use for calculation, and (3) tactics can be used in a
trial-and-error manner thanks to Coq’s interaction mechanism. 

In contrast to deriving functional programs from formal
specifications, MagicHaskeller (68, 69) is an interesting tool that
automatically synthesizes functional programs from input/output
examples. Although still under development, it has demonstrated
the feasibility of the idea by synthesizing many interesting Haskell
programs from just a small number of examples.


3 STRUCTURING COMPUTATION 

In the previous section, we have experienced the liberating power 
of the absence of side-effects. Referential transparency is the basis 
of the wealth of properties that enabled us to reason, to test, to 
optimise, and ultimately to derive programs. On the other hand, 
functional programmers are by no means ascetics withdrawing from 
worldly pleasures. Functional programs are real programs! They 
do perform necessary computations with side effects such as I/Os, 
exceptions, nondeterminism, etc. 

The difference is simply a shift of perspective. Functional
programmers view side effects as costly: they are to be used with
care and, at best, avoided. Let us suppose that we have to pay for
each pair of parentheses in a game of writing arithmetic expressions.
Associativity, which removes the need for parenthesising, becomes
very valuable and players will be incentivized to write 3 × 0.5 × 5
instead of (3 ÷ 2) × 5, for example. Programming is like such writing
of expressions, and the role of a programming language is to
encourage good practices and at the same time discourage the
opposite.

Pure functional languages do just that: side effects are encoded
by a programming pattern known as monads (85, 86, 119, 120).
Since it is an encoding, purity is not threatened. But monads do
enjoy a clearly distinguished syntax and type to encourage thoughtful
design. We will see later in this section, in more detail, the idea
of monadic effects and the benefit of being disciplined in their use.
As a matter of fact, the concept of monads as a programming pattern
has applications far beyond handling side effects; it is
fundamentally a powerful way of structuring computations and has been
adopted in languages with built-in side effects! To name a few, some
readers may have heard of the async monad in F#, promises in
JavaScript, or future in Scala, which are essentially imitations of a
monad used for structuring asynchronous operations. On a different
front, Language INtegrated Query (LINQ) in .NET is directly inspired
by monads and by a syntactic sugar for them in Haskell known as monad
comprehension (following from set comprehension).

In the rest of this section, we will briefly review monads and
demonstrate their uses in structuring computations, in the context of
programming language development, which is itself a particularly
successful application of functional programming.

3.1 Monadic Composition 

Originally a concept in category theory that Moggi used to modularise
the structure of denotational semantics (85, 86), monads were soon
applied by Wadler to functional implementations of the semantics that
Moggi describes, and as a general technique for structuring programs
(119, 120).


A monad M consists of a pair of functions return and (>>=)
(pronounced as “bind”).

return :: a -> M a
(>>=) :: M a -> (a -> M b) -> M b

One should read M a as the type of a computation that returns a value
of type a, and perhaps performs some side-effects. An analogy is to
see a value of type M a as a bank card and its PIN; they are not the
same as plain cash a (of type a), which can be used immediately, but
contain the instructions for producing an a, and in fact any number
of as.

We cannot use a value of type M a directly, but we can combine
it with other instructions that use the result. For example, consider
an evaluator of type Term -> Environment -> M Value where
M is a monad. The evaluator does not give us a value directly, but
we can bind its result to be used in subsequent computations.

eval u e >>= (\a ->
  eval v e >>= (\b ->
    return (Num (a + b))))

This expression evaluates the two operands of an addition and adds
them up. The result of the first evaluation is bound to the variable a
and is passed to the second evaluation, and together with the second
result (bound to b) they form parts of a new value which is again
encapsulated in the monad by return. This pattern of structuring
computation is very common and has been given special syntax
support in Haskell (119). The above expression can be rewritten in
the following equivalent form:

do a <- eval u e
   b <- eval v e
   return (Num (a + b))

which is seemingly similar to the following expression. 

let a = eval u e
    b = eval v e
in Num (a + b)

Now, let us say we want to extend the evaluator with error 
handling as in 

eval :: Term -> Environment -> Maybe Value

data Maybe a = Just a | Nothing

where failures are represented by Nothing and successful computations
return values wrapped in Just. Without giving up on purity and
resorting to a language with side-effects, the non-monadic
definitions which structure computations with let have to be
tediously rewritten: every let binding now has to perform a case
analysis, separating successes from failures, and then decide the
next step accordingly.

For the monadic version, the story is rather different. The structure
of the evaluator is independent of the underlying computation that is
used, and, if carefully designed, changing one entails minimal
changes to the other. In the above, specialising the monad type M to
Maybe links in the right instances of return and >>= (in fact, monads
form a type class).

instance Monad Maybe where
  return x = Just x

  Nothing >>= f = Nothing
  Just x >>= f = f x

There is no change in the evaluator’s definition, as the handling
of failure is captured as a pattern and localized in the definition
of >>=. In a similar manner, we can add to the evaluator real-world
necessities such as I/O, state or nondeterminism, simply by replacing
the underlying monad.
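
A runnable sketch of such an evaluator follows. The paper does not spell out Term, Value or Environment, so the definitions below are our assumptions; the Maybe monad used is the one from the standard Prelude, which behaves like the instance shown above.

```haskell
-- Hypothetical term language; the paper leaves these types abstract.
data Term = Lit Int | Var String | Add Term Term | Div Term Term

newtype Value = Num Int deriving (Eq, Show)

type Environment = [(String, Value)]

-- Failure handling lives entirely in the Maybe monad's bind:
-- no clause below inspects Just or Nothing explicitly.
eval :: Term -> Environment -> Maybe Value
eval (Lit n)   _ = return (Num n)
eval (Var x)   e = lookup x e  -- Nothing if x is unbound
eval (Add u v) e = do
  Num a <- eval u e
  Num b <- eval v e
  return (Num (a + b))
eval (Div u v) e = do
  Num a <- eval u e
  Num b <- eval v e
  if b == 0 then Nothing else return (Num (a `div` b))

main :: IO ()
main = do
  let env = [("x", Num 6)]
  print (eval (Add (Var "x") (Lit 1)) env)          -- Just (Num 7)
  print (eval (Div (Lit 1) (Lit 0)) env)            -- Nothing
  print (eval (Add (Var "y") (Lit 1)) env)          -- Nothing
```

Only the Var and Div clauses mention failure at all; the Add clause reads exactly as it did before error handling was introduced.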

Monads not only encapsulate side effects, but also make them first
class. For example, the function sequence :: [m a] -> m [a], where
m is a monad, evaluates each computation in turn, and collects the
results. In the evaluator example, one may rewrite the expression for
addition as

liftM sum (sequence [eval u e, eval v e])

where sum adds up a list of numbers and is lifted (by liftM) to
handle monadic values.

In addition to structuring programming language semantics, monads
also offer a nice framework for I/O in a pure language. The idea is
to view I/O operations as computations interacting with the “outside
world” before returning a value: the outside world is the state,
which (represented by a dummy “token”) is passed on to ensure proper
sequencing of I/O actions (98). For example, a simple Read-Eval-Print
Loop that reverses the command-line input can be written with
recursion as a function of type IO (), where IO is the monad and ()
is a singleton type with () as its only value.

repl :: IO ()
repl = do
  line <- getLine
  putStrLn (reverse line)
  repl

Given the stateful implementation of the IO monad, it is obvious
that it could be used to support mutable states and arrays (98),
though a different approach based on the normal state monad, which
encapsulates effectful computations inside pure functions, may be 
considered better (EJUS. 

Further extensions of the IO monad include the handling of
concurrency (97). The implementation of software transactional memory
(STM) with monads (47) is a clear demonstration of the benefit of
carefully managed side-effects. Two monads are used in the
implementation: the STM monad structures tentative memory
transactions, the IO monad commits the STM actions, exposing their
effects to other transactions, and a function atomic connects the
two.

atomic :: STM a -> IO a

As a result of this explicit distinction between different types of
effects made possible by monads, only STM actions and pure
computations can be performed inside a memory transaction, ruling out
irrevocable actions by construction, and no STM actions can be
performed outside a transaction, effectively eliminating a class of
bugs altogether. Moreover, since reads from and writes to mutable
cells are explicit as STM actions, the large number of other
computations in a transaction are not tracked by the STM, because
they are guaranteed pure and never need to be rolled back. All these
guarantees make a solution based on monads very attractive indeed.
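
A minimal sketch of this discipline, assuming GHC with the stm package, in which the function called atomic above is exported as atomically. The transfer function and the account balances are hypothetical.

```haskell
import Control.Concurrent.STM

-- Move n units between two accounts inside one transaction.
-- Only STM actions and pure code can appear here; an IO action
-- such as putStrLn would be rejected by the type checker.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  balance <- readTVar from
  check (balance >= n)          -- block and retry until funds suffice
  writeTVar from (balance - n)
  modifyTVar' to (+ n)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)  -- commit the transaction
  balances <- atomically ((,) <$> readTVar a <*> readTVar b)
  print balances                -- (70,30)
```

Conversely, readTVar cannot be used directly in main: it has an STM type, so it must pass through atomically to become an IO action.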

Monads are often used in combination. For example, an evaluator may
need to handle errors and at the same time perform I/O, and it will
be desirable to reuse existing monads instead of creating specialised
ones. The traditional way of achieving such reuse
is through moving up the abstraction level to build monad transfor- 
mers fT09ll75l which are similar to regular monads, but instead of 
standing alone they modify the behaviours of underlying monads, 
effectively allowing different monads to stack up. The downside of 
monad transformers is that they are difficult to understand and are 
fragile to changes. Alternatives have been proposed and it is still an 
active area of research d 1071 1 101~1|701 >. 

Inspired by monads, other abstract views of computation have
emerged, notably arrows (62, 93) and applicative functors (79).
First proposed by Hughes (62), arrows are more general than monads,
allowing notions of computation that may be partially static
(independent of the input) or may take multiple inputs. Applicative
functors are a proper superset of monads, with weaker properties and
thus more members. Similar to monads, both arrows and applicative
functors can be used for structuring the semantics of
EDSLs (58. 3J ). A theoretical comparison of the three can be found 
in (76).


3.2 Embedded Domain-Specific Languages 

So far, we have only discussed one way of developing languages,
namely implementing stand-alone compilers or interpreters. Functional
programming is well suited for the task and the use of monads has a
significant impact on modularity. However, the particular success
of functional programming actually comes from another way of
language development known as embedding (55, 56).

Languages produced through embedding are known as embedded
languages, which are libraries in host languages. Given that such
libraries usually focus on providing functionality for a specific
problem domain, the resulting languages are seen as embedded
domain-specific languages (EDSLs), while the host languages are
usually general-purpose. The EDSL and its host language are one:
the API of the library specifies a set of constructs of the new DSL,
which at the same time shares the tool chain and the generic features
of the host language, such as modules, interfaces, abstract data
types, or higher-order functions. Moreover, the EDSL implementation
is very “lightweight”: the EDSL designer can add features just by
implementing a new function in the library, and can easily move
functionality between the EDSL library and its clients. The ease of
experimentation with such an EDSL helps implementors fine tune the
design, and enables (some) end-users to customise the implementation
with domain-specific optimisations.

Such EDSLs have appeared in a wide spectrum of application areas,
including compiler development, database queries, web applications,
GUIs, music, animations, parallelism, hardware descriptions, image
processing, workflow and more. See Figure 1 for a few examples and
(59) for a comprehensive listing.

We have said that EDSLs are libraries. But obviously not all libra- 
ries are languages. So what is it that elevates EDSLs from their 
humble origin as libraries to the level of languages? 
query = do cust <- table customers
           restrict (cust!city .==. "London")
           project (cust!customerID)

(a) HaskellDB (73) for generating and executing SQL statements

htmlPage content =
  (header << ((thetitle << "Mypage")
    +++ (script ! [thetype "text/javascript",
                   src "http://..."] << "")
  )) +++ (body << content)

(b) (X)HTML G2} for producing XHTML 

mouseTurn g u =
  turn3 xVector3 y (turn3 zVector3 (-x) g)
  where
    (x, y) = vector2XYCoords (pi *^ mouseMotion u)

mouseSpinningPot u =
  mouseTurn (withColorG green teapot) u

(c) Fran 03 for composing interactive, multi-media animations. 


Fig. 1: Code Fragments in Three EDSLs 


For a language to be worth its name, it must allow code fragments 
manipulating primitive data to be shared through defining reusable 
procedures. In an EDSL setting, the data (also known as the domain 
concepts) are themselves functions. For example in modelling 3D 
animation, the domain concept of behaviour , representing time- 
varying reactive values, is a function from time to a “normal” value. 
To have a language, we need to provide constructs that manipu- 
late such domain concepts. This is easy in functional languages: we 
can simply define higher-order functions called combinators , taking 
domain concepts as inputs and combining them into more complex 
ones, just like what procedures do to primitive data. For this
reason, EDSLs are also known as combinator libraries: libraries that
offer not just domain-specific operations but also, more importantly,
combinators that manipulate them. We illustrate this idea by
developing an example EDSL.

A Classic Example: Parser Combinators Our aim is to develop a 
parsing EDSL for the following BNF grammar. 

float ::= sign? digit+ ('.' digit+)?

Like any language, an EDSL has syntax and semantics. In this 
embedded setting, the syntax is simply the interface of the library 
containing the representation of the domain concept, and the opera- 
tions on it; and the semantics is the implementation of the operations 
in the host language. 

A parser is a program that receives a piece of text as the input,
analyses the structure of the text, and produces an output usually
in the form of trees that can be more conveniently manipulated by
other parts of a compiler. A parser can be represented as a function.

newtype Parser a = MkP (String -> [(a, String)])

The type Parser is a new type, distinct from but isomorphic to its
underlying type: a function from strings to lists of parsing results.
The list has one entry for each way in which the parse can succeed,
and is empty if the parse fails. The type parameter a is the type of
the tree produced by parsing, and each tree is paired with the
remaining unparsed string. In DSL terminology, Parser is the domain
concept we are trying to model and, as usual, it has an underlying
representation as a function.

Primitive Parsers With a parser representation at hand, we can start 
building a library to manipulate it. To start with, we define some 
basic parsers. 

item :: Parser Char
item = MkP f
  where
    f [] = []
    f (c : cs) = [(c, cs)]

The parser item consumes a character of the input string and returns
it. A second parser, zero, always fails.

zero :: Parser a
zero = MkP f where f _ = []

These two are the only basic parsers that we will ever need, and all 
the power of the resulting language comes from its combinators. 

Parser Combinators For the grammar we have in mind, our 
language consists of the following set of basic combinators 

sat :: (Char -> Bool) -> Parser Char
plus :: Parser a -> Parser a -> Parser a
optional :: Parser a -> Parser (Maybe a)

In the above, sat enriches item with a predicate so that only
characters satisfying the predicate will be parsed. By providing
different predicates, it can be specialised to a number of different
parsers.

char x = sat (== x)
digit = sat (\x -> '0' <= x && x <= '9')

For example, digit succeeds with "123" and produces [('1', "23")]
as the outcome, but fails (returning []) with "A23". Similarly, char
only recognises the character that is passed to it as input and fails
on all others.

Parser plus p q combines the outcomes of applying p and q. In
the case when one of the two fails, the outcome will be the same as
using the other one.

sign :: Parser Char
sign = (char '+') `plus` (char '-')

In the above, the prefix function plus is turned into an infix one by
the surrounding backquotes.

The combinator optional corresponds to the optional-item marker (?)
in our grammar, allowing an optional field to be parsed into Nothing
when it is not filled. Note that here Nothing represents success in
parsing, not failure, which is represented by the empty list.
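
The text gives only the types of sat, plus and optional. One possible set of definitions, written directly against the underlying function representation, is sketched below. The helper parse and the decision to let optional return both alternatives are our assumptions.

```haskell
newtype Parser a = MkP (String -> [(a, String)])

-- Hypothetical helper: run a parser on an input string.
parse :: Parser a -> String -> [(a, String)]
parse (MkP f) = f

item :: Parser Char
item = MkP f
  where
    f []       = []
    f (c : cs) = [(c, cs)]

-- Keep item's result only when the character satisfies the predicate.
sat :: (Char -> Bool) -> Parser Char
sat ok = MkP (\s -> [ (c, rest) | (c, rest) <- parse item s, ok c ])

-- Collect the outcomes of both alternatives.
plus :: Parser a -> Parser a -> Parser a
plus p q = MkP (\s -> parse p s ++ parse q s)

-- Succeed with Just a result, or with Nothing while consuming no input.
optional :: Parser a -> Parser (Maybe a)
optional p =
  MkP (\s -> [ (Just x, rest) | (x, rest) <- parse p s ] ++ [(Nothing, s)])

char :: Char -> Parser Char
char x = sat (== x)

digit :: Parser Char
digit = sat (\x -> '0' <= x && x <= '9')

sign :: Parser Char
sign = char '+' `plus` char '-'

main :: IO ()
main = do
  print (parse digit "123")           -- [('1',"23")]
  print (parse digit "A23")           -- []
  print (parse (optional sign) "-1")  -- [(Just '-',"1"),(Nothing,"-1")]
```

The last line shows the point made above: the Nothing outcome is a successful parse that leaves the input untouched, whereas failure is the empty list.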

So far, what we have seen are combinators that are useful for con- 
structing parsers recognising individual components of a grammar. 
A crucial missing part is a way to sequence the individual compo- 
nents, so that individual parsers are chained together and applied in 
sequence. For example, our float grammar is a chain of sign, digits 
and fraction parsers. In other words, we need a way to structure
the parsing computation. Sounds familiar? Monads introduced in 
the previous subsection fit the bill very well. 

Just like Maybe, Parser a is a monad with its own bind and 
return. 

(>>=) :: Parser a -> (a -> Parser b) -> Parser b
return :: a -> Parser a

Now we are ready to define combinators that repeatedly apply a 
parser. 

zeroOrMore :: Parser a -> Parser [a]
zeroOrMore p =
  do {x <- p;
      xs <- zeroOrMore p;
      return (x : xs)}
  `plus` return []

oneOrMore :: Parser a -> Parser [a]
oneOrMore p =
  do
    x <- p
    xs <- zeroOrMore p
    return (x : xs)

Parser zeroOrMore p applies p repeatedly and returns the results as a
list. For example, applying (zeroOrMore digit) to "12a" results in
[("12", "a"), ("1", "2a"), ("", "12a")]. Parser oneOrMore p is
similar to zeroOrMore p except that it requires p to apply at least
once.

Just to show that the use of monads has not prevented us from
doing the same kind of equational reasoning as before, we list a few
arbitrarily selected laws as an example.

zero >>= p            = zero
zero `plus` p         = p
p `plus` (q `plus` r) = (p `plus` q) `plus` r
(p `plus` q) >>= r    = (p >>= r) `plus` (q >>= r)

Finally, we can directly translate the BNF grammar at the begin- 
ning of this subsection to the following executable parser. 


float = do
  sgn <- optional sign
  int <- oneOrMore digit
  frac <- optional (do {char '.'; oneOrMore digit})
  return (mkFloat sgn int frac)
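
Putting the pieces together gives a complete program. The sketch below is one way to fill in what the text leaves open: the Functor and Applicative instances are required by current versions of GHC, the helper parse and the body of mkFloat are hypothetical, and the result type Double is our choice.

```haskell
newtype Parser a = MkP (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (MkP f) = f

instance Functor Parser where
  fmap g p = MkP (\s -> [ (g x, rest) | (x, rest) <- parse p s ])

instance Applicative Parser where
  pure x = MkP (\s -> [(x, s)])
  pf <*> px =
    MkP (\s -> [ (g x, s2) | (g, s1) <- parse pf s, (x, s2) <- parse px s1 ])

-- Bind threads the remaining input from one parser to the next.
instance Monad Parser where
  p >>= k = MkP (\s -> concat [ parse (k x) rest | (x, rest) <- parse p s ])

sat :: (Char -> Bool) -> Parser Char
sat ok = MkP f
  where
    f (c : cs) | ok c = [(c, cs)]
    f _               = []

plus :: Parser a -> Parser a -> Parser a
plus p q = MkP (\s -> parse p s ++ parse q s)

optional :: Parser a -> Parser (Maybe a)
optional p = fmap Just p `plus` return Nothing

char :: Char -> Parser Char
char x = sat (== x)

digit :: Parser Char
digit = sat (\x -> '0' <= x && x <= '9')

oneOrMore, zeroOrMore :: Parser a -> Parser [a]
oneOrMore p  = do { x <- p; xs <- zeroOrMore p; return (x : xs) }
zeroOrMore p = oneOrMore p `plus` return []

-- Hypothetical helper: the paper uses mkFloat without defining it.
mkFloat :: Maybe Char -> String -> Maybe String -> Double
mkFloat sgn int frac = factor * read (int ++ "." ++ maybe "0" id frac)
  where factor = if sgn == Just '-' then -1 else 1

float :: Parser Double
float = do
  sgn  <- optional (char '+' `plus` char '-')
  int  <- oneOrMore digit
  frac <- optional (do { _ <- char '.'; oneOrMore digit })
  return (mkFloat sgn int frac)

main :: IO ()
main = print [ x | (x, "") <- parse float "-12.5" ]  -- complete parses only
```

Because plus keeps every alternative, float also returns partial parses such as the one that stops before the decimal point; the list comprehension in main keeps only those that consume the whole input.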

We stop here and look at what we have learnt from this exercise.
Readers interested in knowing more about parser combinators may
start with Hutton and Meijer’s tutorials (65, 66) for more
combinators and techniques for improving efficiency, and several more
papers on the subject fTT2ll2lll40ll231f7l). 


• Combinators are the key. The parser library we have developed
is a language. It is not so much about what specific operations
are provided, but the unlimited possibilities through creative
programming. As we can see, the very small number (two in
our case) of basic parsers, which is typical in EDSLs, can be
combined in different ways to produce highly complex programs.
All this power comes from the combinators, which hinge
on higher-order functions.

• Monads are very useful. The monadic sequential combination
is an excellent formulation of a recurring pattern: a sequence
of operations is performed in turn and their results are
combined in the end. We have relied on it to program parsers that
go beyond simply consuming a single character.

When the semantics of the domain concepts is obvious, directly 
encoding the operations that can be performed on them often results 
in an elegant EDSL implementation. This is exactly what we did for 
the parser example: the set of combinators implements what we can 
do to parsers and the set is easily extensible by defining new com- 
binators. On the other hand, adding new domain concepts usually 
requires a complete reimplementation. Moreover with this direct 
approach, performance of the domain specific programs relies on 
the optimisation of the host language compiler, which is hard to 
predict and control. 

An alternative embedding technique is to generate code from 
the EDSL programs: the domain concepts are represented as an 
abstract syntax tree, and a separate interpretation function provi- 
des the semantics of the syntax constructs, resulting in a so called 
embedded domain specific compiler 03. In this case, the host lan- 
guage is used only at “compile time”; the EDSL programmer can 
use the full power of the host language to express the program, but at 
run-time, only the generated code need be executed. This approach 
is sometimes referred to as deep embedding in contrast to the above
more direct shallow embedding. The separation of phases in deep 
embedding provides opportunities for domain specific optimisati- 
ons as part of the interpretation function and adding new domain 
concepts simply means additional interpretation functions. On the 
other hand, extending the set of operations involves modifying the 
abstract syntax and existing interpretation functions. 
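
The trade-off can be seen on a deliberately tiny expression language. In the sketch below all names are hypothetical; the shallow version identifies a domain concept with its meaning, while the deep version builds a syntax tree that several interpretation functions can traverse.

```haskell
-- Shallow embedding: a domain concept *is* its semantics (here, an Int).
type ExprS = Int

litS :: Int -> ExprS
litS n = n

addS :: ExprS -> ExprS -> ExprS
addS x y = x + y

-- Deep embedding: programs are abstract syntax trees ...
data ExprD = Lit Int | Add ExprD ExprD

-- ... and each interpretation function gives them one meaning.
evalD :: ExprD -> Int
evalD (Lit n)   = n
evalD (Add x y) = evalD x + evalD y

-- A second interpretation, added without touching the first.
prettyD :: ExprD -> String
prettyD (Lit n)   = show n
prettyD (Add x y) = "(" ++ prettyD x ++ " + " ++ prettyD y ++ ")"

-- A domain-specific optimisation on the tree: drop additions of zero.
simplify :: ExprD -> ExprD
simplify (Add x y) =
  case (simplify x, simplify y) of
    (Lit 0, y') -> y'
    (x', Lit 0) -> x'
    (x', y')    -> Add x' y'
simplify e = e

main :: IO ()
main = do
  print (addS (litS 1) (addS (litS 0) (litS 2)))  -- 3
  let e = Add (Lit 1) (Add (Lit 0) (Lit 2))
  print (evalD e)                                 -- 3
  putStrLn (prettyD (simplify e))                 -- (1 + 2)
```

Adding a multiplication operator to the shallow version is one new function; in the deep version it means a new constructor and a new clause in evalD, prettyD and simplify.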

The two embedding approaches are dual in the sense that the 
former is extensible with regard to adding operations while the
latter is extensible with regard to adding concepts (42). The holy
grail of embedded language implementation is to be able to com- 
bine the advantages of the two in a single implementation — a 
manifestation of the expression problem Am Current research 
addresses the problem at two levels: exploiting sufficiently expres- 
sive host languages for a modular representation of the abstract 
syntax tree fiT3l 191 161 l20t. or combining the shallow and deep 
embedding techniques dTToi . Notably in |20k the technique also 
addresses another source of inefficiency with embedded languages 
namely the tagging and untagging required for manipulating the 
abstract syntax trees represented as datatypes. 

Lastly, the connection between EDSLs and monads may extend beyond
the implementation level. It has become popular to include monadic
constructs as part of the EDSL surface language, which sparks
interesting interactions with the underlying type system [95]
[102] [111].


4 PARALLEL AND DISTRIBUTED COMPUTATION

For a long time, parallel and distributed programming was seen as
a specialized activity reserved for experts. This situation is
changing rapidly with the spread of parallel and distributed
environments such as multi-core/many-core hardware and cloud
computing. With Google’s MapReduce [30], one can now easily write
a program that processes and generates large data sets with a
parallel, distributed algorithm on a cluster. The MapReduce model
is in fact inspired by the map and reduce functions commonly used
in functional programming.
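
To make the connection concrete, here is a minimal sequential
model of the map, group and reduce phases, written as a word count
in Haskell. It is only a sketch of the programming model, not of
Google’s system; the names mapper, reducer and wordCount are
hypothetical.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Map phase: turn one input record into a list of key/value pairs.
mapper :: String -> [(String, Int)]
mapper line = [(w, 1) | w <- words line]

-- Reduce phase: combine all values that share a key.
reducer :: String -> [Int] -> (String, Int)
reducer key counts = (key, sum counts)

-- Sequential model of the pipeline: map, group by key, reduce.
wordCount :: [String] -> [(String, Int)]
wordCount docs =
  [ reducer k (map snd grp)
  | grp@((k, _) : _) <-
      groupBy ((==) `on` fst) (sortOn fst (concatMap mapper docs))
  ]
```

Because mapper is applied to each record independently and reducer
to each key independently, both phases could in principle be
distributed over many machines.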

Functional programming languages offer a medium in which
programmers can express the features of parallel algorithms without
having to detail the low-level solutions [67] [77]. The high level
of abstraction provided by function composition and higher-order
functions simplifies the task of programming, fosters code reuse
and facilitates the development of substantially
architecture-independent programs. The absence of side effects
avoids the unnecessary serialization that is a feature of most
conventional programs.

4.1 Parallel Functional Programming 

Functional languages have two key properties that make them
attractive for parallel programming. First, they have powerful
abstraction mechanisms (higher-order functions and polymorphism)
that support an explicit style known as skeleton parallel
programming, which abstracts over both computation and
coordination and achieves architecture-independent parallelism.
Second, they have no side effects, which eliminates unnecessary
dependencies and makes parallelization easy.

4.1.1 Skeleton Parallel Programming Parallel primitives (also
called parallel skeletons) are intended to encourage programmers to
build a parallel program from ready-made components for which
efficient implementations are known to exist, making the
parallelization process easier.

The higher-order functions discussed in Section 2.2 can be regarded
as parallel primitives suitable for parallel computation over
parallel lists, if we impose some properties on the argument
operators. Three well-known data parallel skeletons, map, reduce and
scan, can be defined as special instances of foldr op e. The
definition of map has already been given; the definitions of reduce
and scan are as follows:

    reduce op e = foldr op e
    scan op e   = map (foldr op e) ∘ inits

where op is an associative operator, and scan is defined as the
composition of map (foldr op e) and inits. Note that reduce differs
from foldr in that the operator op it accepts must be associative.
This restricts the type of reduce (the operator must combine
operands of the same type), but allows the implementation greater
freedom to choose an order of combination (foldr always combines
from right to left). The definitions above give only the semantics
of reduce and scan, not necessarily their implementation.
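
The two definitions can be transcribed into Haskell almost
literally. This is a sketch of the specification only; the type
signatures are our own addition, and associativity of op is an
obligation on the caller that the types do not check.

```haskell
import Data.List (inits)

-- reduce has the semantics of foldr, but op is required to be
-- associative (not checked by the type system).
reduce :: (a -> a -> a) -> a -> [a] -> a
reduce op e = foldr op e

-- scan applies the fold to every prefix of the input list.
-- As a specification it is quadratic; real implementations differ.
scan :: (a -> a -> a) -> a -> [a] -> [a]
scan op e = map (foldr op e) . inits
```

For example, scan (+) 0 [1, 2, 3] yields [0, 1, 3, 6], the running
sums of the list.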

It has been shown that map, reduce and scan have efficient
massively parallel implementations on many architectures. If k and
an associative operator op use O(1) parallel time, then map k can
be implemented using O(1) parallel time, and both reduce op e and
scan op e can be implemented using O(log N) parallel time (N
denotes the size of the list). For example, reduce op e can be
computed in parallel on a tree-like structure with the combining
operator op applied at the nodes, while map k is computed in
parallel with k applied to each of the leaves. Efficient parallel
implementations of scan op e have also been studied; they play an
important role in the implementation of the parallel functional
language NESL.
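
The tree-like evaluation of reduce can be sketched as a
divide-and-conquer function. The code below is sequential and only
illustrates the shape of the computation; the name treeReduce is
hypothetical. Assuming op is associative, it agrees with
foldr op e.

```haskell
-- Divide-and-conquer reduce: split the list in half, reduce each
-- half, and combine the two results with op. The two recursive
-- calls are independent, so they could be evaluated in parallel,
-- giving a combining tree of depth O(log N).
treeReduce :: (a -> a -> a) -> a -> [a] -> a
treeReduce _  e [] = e
treeReduce op e xs = go xs `op` e
  where
    -- go is only ever called on non-empty lists.
    go [y] = y
    go ys  = go ls `op` go rs
      where
        (ls, rs) = splitAt (length ys `div` 2) ys
```

The final combination with e mirrors foldr op e, so the two agree
even when e is not a unit of op.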

Just as foldr is the most important higher-order function for
manipulating lists sequentially, reduce plays the most important
role in the parallel processing of lists. If we can express foldr
as a parallel program using reduce, we can parallelize any
sequential function that is defined in terms of foldr (this again
shows an advantage of structuring programs using general functional
forms). This has attracted a lot of work on programming theories for
parallelizing foldr [52]. One well-known result is the so-called
third homomorphism theorem, which shows that foldr op e can be
parallelized as a composition of a map and a reduce if and only if
there exists an operator op' such that the following holds.

    foldr op e = foldr op' e ∘ reverse

In other words, the theorem says that a foldr is parallelizable if
and only if it can also be written as a foldr on the reversed list.
With this theorem, we can see that many of the functions defined in
Section 2.2, such as sum, sort, maxlist and reverse, can be
parallelized as a composition of a map and a reduce. Importantly,
this parallelization is not just a guide for programmers but can be
done automatically.
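
As a hand-written illustration of the theorem (not the output of
any automatic tool), consider sorting. Insertion sort is a foldr,
and it computes the same function when applied to the reversed
list, so it can be rewritten as a map followed by a reduce with an
associative merge. The names sortR, sortL, merge and sortH are
hypothetical, and foldr stands in for reduce.

```haskell
import Data.List (insert)

-- Rightward fold: insertion sort.
sortR :: Ord a => [a] -> [a]
sortR = foldr insert []

-- The same function written as a fold over the reversed list.
sortL :: Ord a => [a] -> [a]
sortL = foldr insert [] . reverse

-- Merge two sorted lists; associative, with [] as its unit.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge xs@(x : xt) ys@(y : yt)
  | x <= y    = x : merge xt ys
  | otherwise = y : merge xs yt

-- Map each element to a singleton list, then reduce with merge.
sortH :: Ord a => [a] -> [a]
sortH = foldr merge [] . map (\x -> [x])
```

Since merge is associative, the reduce step in sortH may combine
sublists in any bracketing, which is what permits a parallel,
tree-shaped evaluation.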

4.1.2 Easy Parallelization Purely functional languages have
advantages when it comes to (implicit) parallel evaluation. Thanks
to the absence of side effects, it is always safe to evaluate
subexpressions in parallel. It is therefore straightforward to
identify the parallel tasks in a program, whereas parallelizing an
imperative program would require complex dependency analysis.

Parallel Haskell provides two operators, pseq and par, for
parallelization: pseq e1 e2 evaluates e1 and then e2 in sequential
order, and par e1 e2 is a kind of fork operation, where e1 is
started in parallel with e2 and the result of e2 is returned.
Consider the following ordinary Haskell function that implements
the well-known Quicksort algorithm:

    sort []       = []
    sort (x : xs) = less ++ [x] ++ greater
      where
        less    = sort [y | y <- xs, y < x]
        greater = sort [y | y <- xs, y >= x]

The following parallel version is only a little more complicated:
greater is computed in parallel with less by wrapping the original
expression less ++ [x] ++ greater with par greater and pseq less.

    import Control.Parallel (par, pseq)

    parSort []       = []
    parSort (x : xs) =
        par greater (pseq less
                       (less ++ [x] ++ greater))
      where
        less    = parSort [y | y <- xs, y < x]
        greater = parSort [y | y <- xs, y >= x]

We can further control the granularity of the parallel tasks by
switching to the ordinary sort when the list is short enough.


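    -- shorter is not defined in the text; it is assumed to test
    -- whether the list is below some length threshold.
    parSort [] = []
    parSort l@(x : xs)
      | shorter l = sort l
      | otherwise =
          par greater (pseq less
                         (less ++ [x] ++ greater))
      where
        less    = parSort [y | y <- xs, y < x]
        greater = parSort [y | y <- xs, y >= x]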


It is worth noting that parallel functional programs are easy to 
debug. Regardless of the order in which computations are execu- 
ted, the result of the program will always be the same. Specifically, 
the result will be identical to that obtained when the program 
is run sequentially. This implies that programs can be debugged 
sequentially, which represents a huge saving in effort. 


4.2 Distributed Functional Programming 

Distributed systems, by definition, do not share memory between 
nodes (computers) — which means that the imperative approach to 
parallel programming, with shared mutable data structures, is inap- 
propriate in a distributed setting. Instead, nodes communicate by 
copying messages from sender to receiver; copying a mutable data 
structure changes semantics, because mutation of one copy is not 
reflected in the other, but immutable data can be copied transparen- 
tly. This makes functional programming, with its immutable data 
structures, a natural fit. For systems which are designed to be sca- 
lable, in which overall system performance can be increased just by 
adding more nodes, it makes sense to use the same share-nothing 
communication between processes running on the same node, so 
that they may easily be deployed across multiple nodes as new nodes 
are added. Distributed systems can also offer high reliability, by 
using redundancy to tolerate failures, and in-service upgrade, by 
upgrading one node at a time while the others continue to deliver 
service. 

Erlang was designed at Ericsson in the late 1980s for building
systems of this kind, originally for the telecom domain. Later,
Erlang proved to be ideal for building scalable internet services,
and many start-ups have used it as their “secret sauce”. The first
of these was Bluetail AB, founded in 1998 to develop, among other
products, an SSL accelerator in Erlang, and sold for $150 million
less than 18 months later; the most spectacular to date is
WhatsApp, whose back-end servers are built in Erlang, and which was
sold to Facebook in 2014 for $22 billion.

Erlang is a simple functional language with a slightly wordier
syntax than Haskell; the factorial function defined in the
introduction would be written in Erlang as follows:


    fac(0) -> 1;
    fac(N) when N > 0 -> N * fac(N - 1).


    qsort([]) -> [];
    qsort([X|Xs]) ->
        Parent = self(),
        Less = [Y || Y <- Xs, Y < X],
        Grtr = [Y || Y <- Xs, Y >= X],
        spawn(fun() -> Parent ! {less, qsort(Less)} end),
        spawn(fun() -> Parent ! {grtr, qsort(Grtr)} end),
        receive {less, SortedLess} ->
            receive {grtr, SortedGrtr} ->
                SortedLess ++ [X] ++ SortedGrtr
            end
        end.


Fig. 2: Parallel Quicksort in Erlang 


Erlang provides immutable lists and tuples, and LISP-like atoms,
but no user-defined datatypes. Erlang lacks a static type system⁸,
a reasonable choice since dynamic code loading, necessary for
in-service upgrades, is difficult to type statically to this day.

To this functional core, Erlang adds features for concurrency and
message passing. For example, Figure 2 presents a (not very
efficient) parallel version of Quicksort in Erlang. This function
uses pattern matching on lists to select between the case of an
empty list and a cons ([X|Xs] means x : xs), and in the latter
case uses list comprehensions to select the elements less than, or
greater than or equal to, the pivot, then spawns two new processes
to sort each sublist recursively. Spawning a process calls the
function provided (as an Erlang λ-expression, fun() -> ... end) in
the new process. Each of these processes sends the result of its
recursive sort back to the parent process (Parent ! ...), using
the parent’s process identifier, which is obtained by calling
self(). Each result is tagged with an atom (less or grtr), which
allows the parent process to receive the results in the correct
order: messages wait in the recipient’s “mailbox” until a matching
receive removes them from it, so it doesn’t matter in which order
the messages from the child processes actually arrive. Erlang
processes share no memory (they each have their own heap), which
means that the lists to be sorted must be copied into the new
process heaps. This is why we filter Xs to extract the less and
greater elements before starting the child processes: it reduces
the cost of copying lists. The advantage of giving each process its
own heap is that processes can garbage collect independently while
other processes continue working, which avoids long garbage
collection pauses and makes Erlang suitable for soft real-time
applications.
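
For readers more familiar with Haskell, the control structure of
Figure 2 can be approximated with lightweight threads and MVars
from the base library. This is only an analogue under stated
assumptions: Haskell threads share one heap, so nothing is copied,
and each MVar merely plays the role of one tagged message. The name
concSort is hypothetical.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Each call forks two threads; each child reports its sorted
-- sublist through its own MVar. The parent takes the results in a
-- fixed order, so the arrival order of the results does not matter.
concSort :: Ord a => [a] -> IO [a]
concSort []       = return []
concSort (x : xs) = do
  lessBox <- newEmptyMVar
  grtrBox <- newEmptyMVar
  _ <- forkIO (concSort [y | y <- xs, y <  x] >>= putMVar lessBox)
  _ <- forkIO (concSort [y | y <- xs, y >= x] >>= putMVar grtrBox)
  sortedLess <- takeMVar lessBox
  sortedGrtr <- takeMVar grtrBox
  return (sortedLess ++ [x] ++ sortedGrtr)
```

Like the Erlang version, this spawns far too many threads to be
efficient; it is meant only to show the fork-and-collect pattern.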

Erlang adds mechanisms for one process to monitor another, turning
a crash in the monitored process into a message delivered to the
monitoring one. These mechanisms are used to support fault
tolerance, with a hierarchy of supervisor processes which are
responsible for restarting subsystems that fail; indeed Erlang
developers advocate a “let it crash” approach, in which
error-handling code (which is often complex and poorly tested) is
omitted from most application code, relying on the supervisors for
fault tolerance instead. Common patterns for building
fault-tolerant systems are provided in the “Open Telecom Platform”
libraries, essentially higher-order functions that permit
fault-tolerant systems to be constructed just by instantiating the
application-dependent behaviour.


8 Although many developers use Dialyzer [100], a static analysis
tool that can detect many type errors.

Erlang’s approach to concurrency and distribution has been very
successful, and has been widely emulated in other languages; for
example, Cloud Haskell [36] provides similar features to Haskell
developers. One of the best known “clones” is the Akka library for
Scala [112], which is used, among other things, to build Twitter’s
back-end services.


5 FUNCTIONAL THINKING IN PRACTICE 

Functional programming has had a strong influence on education and
on the design of other languages, and has seen significant use in
industry.

5.1 Education 

The practice of teaching a functional language as the first
language was pioneered by MIT in the 1980s, where Scheme was taught
in the first course using the famous textbook Structure and
Interpretation of Computer Programs. Many universities, such as
Oxford (Haskell) and Cambridge (ML), now follow this
functional-first style. According to a recent survey⁹, 19 of the
top 30 American universities in the US News 2014 Computer Science
Ranking give their undergraduate students serious exposure to
functional languages. Compared with other programming-first
approaches, the functional-first approach has the advantages of
reducing the effect of differences in students’ backgrounds, and of
letting students focus on more fundamental issues, think more
abstractly, and meet the ideas of recursion, data structures and
functions as first-class data earlier. In fact, one of the explicit
goals of Haskell’s designers was to create a language suitable for
teaching. Indeed, almost as soon as Haskell was defined, it was
being taught to undergraduates at Oxford and Yale.

For learning the latest advanced functional programming techni- 
ques, there has been an excellent series of International Summer 
Schools on Advanced Functional Programming since 1995. Five 
such summer schools have been held so far in 1995, 1996, 1998, 
2002, and 2004, with all lecture notes published in Lecture Notes 
in Computer Science by Springer. For the new applications of 
functional programming, there has been another series of Summer 
Schools on Applied Functional Programming organized by Utrecht 
University since 2009. The two-week course covers applications 
of functional programming, concentrating on topics such as lan- 
guage processing, building graphical user interfaces, networking, 
databases, and programming for the web. 

For exchanging ideas about functional programming in education, a
series of International Workshops on Trends in Functional
Programming in Education has been held since 2012, where novel
ideas, classroom-tested ideas and work in progress on the use of
functional programming in education are discussed among researchers
and teachers. The workshops have been held in St Andrews, Scotland
(2012), Provo, Utah, USA (2013), and Soesterberg, The Netherlands
(2014).


9 http://www.pl-enthusiast.net/2014/09/02/who-teaches-functional-programming/


5.2 Influences on Other Languages 

Ideas originating in functional programming, such as higher-order
functions, list structures and type inference, have made their way
into the design of many modern programming languages, quietly
exerting their influence. It is quite conceivable that “mainstream”
programmers are using functional features in their code without
realising it.

One early example is blocks in Smalltalk, a way of expressing
lambda expressions and therefore higher-order functions. More
recently, the development of C# has been influenced by functional
programmers working at Microsoft. The LINQ (Language INtegrated
Query) features are directly inspired by monads and functional
lists. Lambda expressions were introduced in C# 3.0, but
higher-order programming had been possible earlier through
delegates. Type inference is part of C#’s design, and generics
(polymorphism) were added in C# 2.0.

Java’s generic type system, introduced in Java 5, is based on ML’s
Hindley-Milner type system. Subsequent releases gradually
introduced type inference, another feature that is usually
associated with functional languages. Java 8 embraced functional
programming by releasing a wealth of features specifically aimed at
facilitating such a programming style. It includes dedicated
support for lambda expressions and the passing of functions as
method arguments (i.e., higher-order functions), which is made even
easier by the new feature of method references.

C++ boarded the lambda expression train with C++11. A particular
merit of this C++ feature, as opposed to lambdas in other
imperative languages, is that it offers fine-grained control over
variable capture. Programmers can declare in a lambda’s definition
whether its body may refer to external variables (variables that
are not formally declared as parameters) by value, by reference, or
not at all, a step towards purity.

Modern multi-paradigm languages often make good provision for
functional features. Ruby’s inventor acknowledges Lisp as its
origin, with blocks at its core; additional lambda syntax was added
in Ruby 1.9. Python adopted the list comprehension notation and
supports lambda expressions, and its standard library includes many
functional tools borrowed from Haskell and Standard ML. Scala is an
object-functional language with full support for functional
programming. In addition to a strong static type system, it
features currying, type inference, immutability, lazy evaluation,
and pattern matching.

Meijer’s reactive extensions (“Rx”) enable .NET developers to
manipulate asynchronous data streams as immutable collections with
purely functional operations defined over them. Rx dramatically
simplifies the construction of event-driven reactive programs. The
design has been copied in many other languages, perhaps most
notably at Netflix, where RxJava is heavily used in the Netflix
API.


5.3 Uses in Industry 

It has often proven easier to adopt functional programming in
small, new companies than in large, existing software development
organizations. By now, a great many start-ups have used functional
programming for their software development. One of the first was
Paul Graham’s Viaweb, which built a web shop system in Lisp, and
was later sold to Yahoo. Graham’s well-known article “Beating the
Averages” discusses Lisp’s importance as Viaweb’s “secret weapon”.
He writes of studying competitors’ job adverts: “The safest kind
were the ones that wanted Oracle experience. You never had to worry
about those. You were also safe if they said they wanted C++ or
Java developers. ... If I had ever seen a job posting looking for
Lisp hackers, I would have been really worried.”

The first company to use Haskell was Galois, which develops “high
assurance” software, primarily under government contracts. Galois
develops its code in Haskell, which in itself helps to ensure
quality, but also uses property-based testing and the theorem
provers we have discussed in this paper. More recently, Facebook
has described its system for detecting and remediating spam as the
largest Haskell deployment currently in existence, actively and
automatically preventing vast amounts of undesirable content from
reaching its users. At the heart of the system is a domain-specific
language for writing the detection logic, first implemented in FXL
(an in-house functional language) and currently being migrated to
Haskell.

Erlang, with its roots in industry, has a strong start-up culture.
The first Erlang start-up, Bluetail AB, was set up after Ericsson
decided to discourage the use of Erlang internally. Many key
members of the Erlang team left Ericsson to found a company
focussing on scalable and highly reliable internet products; within
18 months Bluetail was sold to Alteon Web Systems for $150 million.
The founders were soon starting new companies, including Klarna AB
(providing invoicing services to web shops in 7 European countries)
and Tail-f (sold to Cisco in 2014 for $175 million). The
highest-profile Erlang start-up to date, though, is WhatsApp,
bought by Facebook in 2014 for $22 billion. A tech blog at the time
asked “How do you support 450 million users with only 32 engineers?
For WhatsApp, acquired earlier this week by Facebook, the answer is
Erlang”. This nicely illustrates the benefits in productivity,
scalability, and reliability that functional programming delivers.

Functional programming has also found many users in the financial
sector, thanks not least to Peyton Jones et al.’s seminal work on
modelling financial contracts in Haskell [96]. Traders need to
evaluate all kinds of financial derivatives quickly, so they can
decide at which price to buy or sell them. But new, ingenious
derivatives are being invented constantly, forcing traders to
update their evaluation software continuously. By providing
combinators for modelling contracts, this task can be accelerated
dramatically, bringing a substantial advantage to traders who use
them: the first trader able to evaluate a new derivative correctly
stands to make a considerable profit. Credit Suisse was the first
to use this technology, with the help of Augustsson, who now plays
a similar role at Standard Chartered, but these are far from the
only examples. Functional programming is also used for automated
trading at Jane Street, whose systems are built in OCaml. The
clarity and quality of OCaml code help Jane Street guard against
bugs that might rapidly lose large sums of money.

Languages such as Scala and Clojure (which run on the JVM), and F#
(which runs on .NET) aim to be less disruptive, by integrating with
an existing platform. They are enjoying wide adoption; for example,
Apache Spark, a popular open-source framework for Big Data
analytics, is largely built in Scala. Nowadays there are successful
companies whose business idea is to support the adoption of
functional programming by their customers: Erlang Solutions (for
Erlang), Typesafe (for Scala), Cognitect (for Clojure), Well-Typed
and FP Complete (for Haskell).

New applications of functional programming are appearing
constantly. A good way to follow these developments is via
developer conferences such as the Erlang Factory, Scala Days, and
Clojure/conj, and also via the annual conference on Commercial
Users of Functional Programming, held in association with ICFP.


6 CONCLUSION 

Twenty-five years ago, functional programming was high in the sky,
favored only by researchers in the ivory tower. Twenty-five years
on, it has touched down and had a wide impact on society as a new
generation of programming technology. What will functional
programming look like in another twenty-five years?